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NOTATION 


The  matrix  in  the  linear  vector  iteration  (Equation  (1.1)) 

Constant  vector  in  the  linear  vector  iteration 

Rectangular  matrix,  the  columns  of  which  are  successive  second 
difference  vector,  h 

Transpose  of  H 

Generalized  inverse  h 
Second  difference  vector 

Initial  second  difference  vector  of  a  sequence 

Identity  matrix 

Jordan  Canonical  form  for  A 

(Unnamed)  norm  for  vectors  (see  Equation  (4.13)) 

Euclidean  norm  for  vectors  (see  Equation  (4.14)) 

Chebyshev  norm  for  vectors  (see  Equation  (4.15)) 

Dimension  of  vector  space  =  number  of  equations 
in  iteration  system  Equation  (1.1) 

Iteration  index 

Residual  vector  (Equation  (4.7)) 

Rank  of  matrix  H  and  of  extrapolation  process 

Triangular  square  route  of  H*  H  (Equation  (4.19)) 

Rectangular  matrix  the  columns  of  which  are  successive 
first  difference  vectors  u. 

Generalized  inverse  of  U 
First  difference  vector 

First  difference  vector  resulting  from  putting 
extrapolated  vector  H  back  into  iteration 


Initial  first  difference  vector  of  a  sequence 

Generic  vector  in  m-space 
Initial  vector  in  a  sequence 

First  output  vector  of  iteration  Equation  (1.1) 
Second  output  vector 


Extrapolated  vector  (see  Equations  (3.3)  and  (4.6)) 


x  Exact  solution  vector  (see  Equations  (2.7)  and  (3.2)) 

e  Tolerance  to  be  met  by  norm  of  R 

A  Eigenvalue  of  A 

£  Auxiliary  vector  in  Equations  (3.3),  (3.4),  (4.6),  (4.7) 

FORTRAN  PARAMETERS 

ISK  Number  of  iterations  skipped  over  before  beginning  an 

extrapolation  cycle 

M  Same  as  m  above 

MFB  Controls  whether  output  of  x  of  an  extrapolation 

cycle,  for  which  tolerance  is  not  met,  is  or  is  not 
forced  back  as  input  to  next  iteration 

MON  Controls  whether  input  to  an  iteration  is  only 

the  previous  vector  XOLD,  or  whether  components 
of  the  output  vector  XNEW  are  fed  in  as  soon  as 
they  have  been  generated 

MXT  Maximum  allowed  value  of  iteration  counter  NIT  (to 

prevent  runaway  computations) 

N  Assigned  maximum  number  of  columns  of  H,  also 

maximum  allowed  rank  of  extrapolation 


NC 

NL 

NR 


Number  of  columns 
Number  of  layers 
Number  of  rows 


in  rectangular  grid  of  sample 
problem 


NIT 


Iteration  counter 


NSK 


Number  of  iterations  skipped  between  end  of  one 
extrapolation  cycle  and  beginning  of  the  next 

Same  as  e  above 


TOL 

FORTRAN  ARRAYS  AND 
DIMENSIONS 


DIF1 

(M.N+l) 

Same  as  U 

DIF2 

(M,N) 

Same  as  H 

HUTF 

(N,N+1) 

Holds  (H*H,H+Uq)  and  also  the  results  of 
triangularization  thereof 

RSID 

(M) 

Same  as  R 

XHLD 

(M) 

Holds  initial  vector  x^  through  an 
extrapolation  cycle 

XKSI 

(N) 

Same  as  £ 

XNEW 

(M) 

Output  of  iteration,  xn+^ 

XOLD 

(M) 

Input  of  iteration,  x 

ABSTRACT 


A  new  family  of  methods,  called  reduced  rank  extrap¬ 
olation,  is  developed  for  accelerating  convergence  of  the 
sequence  of  vectors  generated  during  the  iterative  solution 
of  a  system  of  m  linear  algebraic  equations  in  m  unknowns. 
Large  systems  of  this  kind  arise,  for  example,  in  the 
finite  difference  or  finite  element  solution  of  partial 
differential  equations. 

Reduced  rank  extrapolation  is  derived  from  full  rank 
extrapolation,  which  is  a  straightforward  generalization 

2 

to  vector  space  of  the  well  known  Aitken  A  -Shanks  e^, 
scalar  extrapolation.  It  is  applicable  when  the  iteration 
has  reached  a  point  where  only  a  few,  say  r,  eigenvalues 
dominate  the  situation  and  hence  only  r  difference  vectors 
can  be  linearly  independent  to  specified  tolerance.  The 
rank,  r,  is  determined  during  the  solution  of  an  auxiliary 
problem  of  best  approximation  in  vector  space,  i.e.,  "best" 
in  the  sense  of  minimizing  some  specified  vector  norm. 

Least  squares  theory,  corresponding  to  the  Euclidean  norm, 
is  developed  in  detail  herein. 

Application  to  Laplace’s  equation  in  a  square  and  in  a 
cube  yielded  reduction  in  computation  time  by  a  factor 
ranging  between  2.4  and  4.7,  and  reduction  in  iteration 
count  by  a  factor  ranging  between  3.6  and  5.4. 


ADMINISTRATIVE  INFORMATION 

This  research  was  carried  out  under  the  NAVSEA  Mathematical  Sciences 
Research  Program,  Task  Area  SR0140301,  Element  61153N,  Work  Unit  1808-010. 


1.  INTRODUCTION  AND  SUMMARY 

This  report  presents  a  new  family  of  methods,  called  reduced  rank  extrapolation, 
for  accelerating  or  producing  convergence  of  the  sequence  of  vectors  generated  in 
the  iterative  solution,  by  whatever  scheme,  of  a  system  of  linear  algebraic 
equations  in  m  variables 


xn+l  =  A  xn  +  b  (1‘1) 

where  x  and  b  are  m  x  1  vectors  and  A  is  a  (constant)  m  x  ra  matrix.  Large  systems 
of  this  type  occur  in  the  finite  difference  or  finite  element  solution  of  partial 
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differential  equations  of  elliptic  or  parabolic  type.  These  arise  in  many  important 

physical  and  engineering  problems,  such  as  those  concerned  with  the  flow  of  water 

around  ship  hulls,  the  flow  of  air  around  aircraft  and  missiles,  and  the  flux  of 

neutrons  in  a  nuclear  reactor.  Standard  methods  of  setting  up  the  iteration  matrix 

A  in  terms  of  the  original  background  equations  are  the  subject  of  such  well  known 

1*  2  3 

textbooks  as  those  of  Varga  (1962),  Wachspress  (1966),  Young  (1971),  and 
Forsythe  and  Moler  (1967). 

The  motivating  idea  behind  the  reduced  rank  methods  can  be  developed  as  follows. 
In  the  background  is  the  well  known  formula  for  extrapolating  to  the  limit  of  a 
scalar  sequence 


2 

x  =  (x  x  -X  )/(x  - -2x  +x  n) 

n-1  n+1  n  n+1  n  n-1 


2 

=  x  1  -  (x  -X  n )  / (x  ,.-2x  +x 
n-1  n  n-1  v  n+1  n  n- 


(1.2) 


due  to  Aitken  (1926) and  to  Shanks  (1949),**  (1955).^  One  derivation  of  this  formula 
can  be  generalized  in  a  straightforward  manner  to  apply  to  vector  sequences.  The 
result.  Equation  (3.2),  is  here  called  full  rank  extrapolation;  it  seems  to  be 
known,  but  not  "well  known."  (The  treatment  in  Section  3  is  taken  from  the  present 
author’s  working  notes  dated  1952.)  Full  rank  extrapolation  can  be  helpful  in  the 
iterative  solution  of  small,  mildly  nonlinear  systems.  However,  it  is  of  no  use  for 
*rge  linear  systems  because  it  requires  the  inversion  of  an  m  *  m  matrix  of  second 
difference  vectors.  This  matrix  has  the  same  size  as  the  original  system  and  is 
very  likely  to  be  much  more  ill  conditioned. 

This  situation  invites  exploration  of  how  much  can  be  accomplished  with  only  a 
few  difference  vectors — say  r  of  them,  where  1  —  r  «  m.  (This  is  the  origin  of  the 
term  "reduced  rank  extrapolation".)  The  outlook  is  hopeful  when  one  recalls  that, 
as  the  iteration  of  Equation  (1.1)  proceeds,  the  successive  become  expressible, 
to  a  given  tolerance,  in  terms  of  fewer  and  fewer  eigenvectors  of  A — i.e.,  those 
associated  with  the  eigenvalues  of  greatest  magnitude.  Under  these  conditions,  only 
a  corresponding  number  of  first  difference  vectors  or  second  difference  vectors  can 


*A  complete  listing  of  references  is  given  on  page  45. 
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be  linearly  independent — again  to  specified  tolerance.  It  is  not  known  in  advance 
just  how  many  such  vectors  will  be  needed;  the  number  is  a  function  of  the  tolerance 
imposed  and  is  determined  in  the  course  of  computation. 

The  theory  of  reduced  rank  extrapolation,  as  developed  in  Section  4,  is  both 
simple  and  elegant.  It  differs  from  most  other  methods  of  accelerating  vector 
sequences  in  two  respects:  (1)  vectors  themselves,  not  their  individual  components, 
are  regarded  as  the  basic  entities,  (2)  only  "observables , "  i.e.,  readily  computable 
quantities  such  as  iteration  vectors,  difference  vectors,  and  vector  norms  appear 
in  the  final  version  of  the  theory.  Although  eigenvalues  lurk  in  the  background, 
there  is  no  question  of  estimating  them  and  there  are  no  difficulties  associated 
with  close,  repeated,  or  degenerate  eigenvalues. 

Reduced  rank  extrapolation  can  be  described  as  follows:  let  x^  be  some 
selected  vector  in  the  sequence  generated  by  the  basic  iteration  of  Equation  (1.1), 
let  u^  =  x^  -  Xq,  and  let  x  be  the  result  of  rank  r  extrapolation.  Then  It  -  is 
expressed  as  a  linear  combination  of  r  successive  first  differences  u^,  u^,  ... 

Uf_^,  the  coefficients  being  those  by  which  (“uq)  is  "best"  represented  in  terms  of 
r  successive  second  difference  vectors  h^,  h^,  ...  h^_^.  The  three  interpretations 
of  "best  approximation"  in  terms  of  the  standard  vector  norms  J^,  and 
combined  with  the  several  computational  schemes  available  for  optimization  under 
each  of  these  norms  and  the  flexibility  in  r,  give  rise  to  the  family  of  extrapo¬ 
lation  methods  announced  in  the  title. 

There  are  two  types  of  limitations  on  the  use  of  reduced  rank  extrapolation. 

The  first  is  inescapable:  (I-A)  must  be  nonsingular  so  that  none  of  the  eigen¬ 
values  of  A  can  be  +1.  In  fact,  if  any  one  of  them  is  too  close  to  +1,  the 
extrapolated  vector  may  be  seriously  in  error.  This  comment  is  made  precise  by  the 
error  bound.  Equation  (4.12).  The  second  limitation  has  to  do  with  available 
computer  storage  capacity  for  the  arrays  of  first  and  second  difference  vectors 
required.  This  matter  will  be  taken  up  in  Section  5. 

2.  PRELIMINARIES  ON  DIFFERENCES  AND  ERRORS 

From  the  vector  iteration  Equation  (1.1)  it  is  trivial  to  show  that  the 
successive  m-component  first  difference  vectors 


(2.1) 


satisfy 


a 

n 


A  u 


n-1 


An 

A  u  r 


(2.2) 


The  second  difference  vectors  are  given  by 

hn  ‘  u„+l  -  '  <*-!>  un  (2.3) 

A  central  role  in  the  theory  to  be  developed  here  is  played  by  the  following 
ro  x  r  (1—r— m)  rectangular  matrices  whose  columns  are  first  or  second  difference 
vectors,  as  indicated: 


U  =  ^u0,ul’ • ■ ,ur-l^ 

(2.4) 

H  =  (hQ,hv  . .  .h^) 

(2.5) 

It  follows  from  Equation  (2.3)  that 


H  «  (A-I)  U 


(2.6) 


This  relation  plays  a  key  role  in  the  theory. 

Error  vectors  behave  much  like  difference  vectors.  Let  x  be  the  exact  solution 


x  =  A  x  +  b 


(2.7) 


Then  it  is  trivial  to  show  that 


*n  -  x  =  A(xu_1~x)  =  ...  =  An(xQ-x)  (2.8) 

There  is  actually  an  Intimate  relationship  between  a  difference  vector  and  the 
corresponding  error  vector: 


4 


(2.9) 


u  =  x  .,  -  x  =  A  x  +  b  -  x  =  (A-I)  x  -  (A- 1)  x 
n  n+1  n  n  n  n 


=  (A-I)  (xn-x) 


analogous  to  the  relation  in  Equation  (2.3)  between  second  and  first  order 
difference  vectors. 


3.  FULL  RANK  EXTRAPOLATION 

Let  {x^}  be  the  sequence  of  vectors  generated  by  Equation  (1.1).  Then 
formally 


xn  =  X0  +  (xrX0}  +  (X2-X1}  +  +  (VVl} 


X0  +  U0  +  U1  +  "•  +  Vl 


Xq  +  (I+A+A^+. . .+An  Uq 


xq  +  (I-A)_1  (I-An)  Uq 


If  all  eigenvalues  of  A  are  less  than  unity  in  absolute  value,  then  lim  A  =0,  and 

n~*°° 

the  vector  sequence  converges  to  the  solution: 


lim  x  -  x  =  xn  +  (I-A)  un 
n  u  u 

n-K» 


(3.1) 


Actually  it  is  trivial  to  prove,  by  substitution  into  Equation  (2.7),  that  the 
second  equality  here  is  an  identity;  it  holds  regardless  of  convergence  provided 
only  that  the  indicated  inverse  exists,  i.e.,  that  no  eigenvalue  of  A  can  be  +1. 

If  the  matrices  U  and  H  are  both  m  x  ra,  and  if  H  is  nonsingular,  then  from 
Equation  (2.6)  it  follows  that  (I-A)  ^  -  U  H  ^  and  Equation  (3.1)  becomes 


x  -  x0  -  u  ff1  uQ 


(3.2) 
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which  can  also  be  written  as  the  pair  of  equations 


+  uc 

(3.3) 

+  H  £ 

(3.4) 

In  either  of  these  forms  this  vector  extrapolation  formula  constitutes  the  full 

2 

rank  generalization  to  m-dimensional  vector  space  of  the  well  known  Aitken  A  -Shanks 
e^  formula,  Equation  (1,2),  for  extrapolating  scalar  sequences:  both  Equations  (3.2) 
and  (1.2)  are  derived  by  essentially  identical  chains  of  reasoning  and  Equation  (3.2) 
reduces  to  Equation  (1.2)  for  m  =  1. 

Although  theoretically  Equation  (3.2)  yields  the  exact  limit,  x,  it  is  not 
really  useful  for  solving  linear  systems  because  it  requires  solving  a  linear  system 
of  the  same  order  as  the  original.  However,  it  can  be  useful  in  the  iterative 
solution  of  mildly  nonlinear  systems  and  it  does  provide  a  pattern  for  the  more 
useful  reduced  rank  extrapolation. 

4.  REDUCED  RANK  EXTRAPOLATION  DEFINITION  AND  JUSTIFICATION 

As  originally  conceived,  the  term  ,freduced  rank  extrapolation’1  was  intended  to 

convey  the  idea  that  the  rank  of  the  matrices  U  and  H  appearing  in  the  extrapolation 

formulas  of  Equations  (3.2),  (3.3),  and  (3.4)  is  not  m,  the  dimension  of  the 

vector  space,  but  rather  some  smaller  integer  r:l  —  r  <  m.  In  other  words,  only  r 

successive  first  difference  vectors  urt,  ut ,  . . .  u  , ,  or  second  difference  vectors 

0  1  r-I 

h^,  h^,  ...  h^  y  are  linearly  independent,  at  least  to  the  accuracy  carried  in 
computation.  This  situation  obtains  when  the  iteration  of  Equation  (1.1)  has 
progressed  sufficiently  far  that  contributions  associated  with  the  m-r  eigenvalues 
smallest  in  magnitude  are  no  longer  significant. 

The  proof  of  this  last  statement  involves  the  reduction  of  the  iteration 
matrix  A  to  classical  (Jordan)  canonical  form  J  by  means  of  a  similarity 
transformation 

A  W  *  W  J  (4.1) 
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The  columns  of  W  are  the  eigenvectors  (or  principal  vectors)  of  A,  and  J  is  a 
diagonal  matrix  of  eigenvalues  (perhaps  with  Jordan  boxes)  arranged  in  order  of 
decreasing  magnitude.  The  eigenvalues  must  all  satisfy 

IM  <  1  (4.2) 

if  the  original  iteration  process  is  to  converge  at  all.  From  Equation  (4.1)  it 
follows  that 

An  W  =  W  Jn  (4.3) 

According  to  the  preceding  discussion,  if  n  is  large  enough,  then  all  elements  in 
some  lower  right  quadrant  of  Jn  will  be  smaller  in  magnitude  than  any  preassigned 
threshold  value.  Correspondingly,  all  main  diagonal  elements  of  the  r  x  r  upper 
left  quadrant  will  have  magnitudes  above  the  threshold  and  will  be  considered 
significant.  Now  let  u^  be  expressed  as  a  linear  combination  of  the  eigenvectors  of 
A: 

“0  =  W  yQ  (4.4) 

The  corresponding  expression  for  u^  is  obtained  by  using  Equations  (2.2)  and  (4.3): 

un  =  An  uQ  =  An  W  yQ  =  W  Jn  yQ  =  W  yn  (4.5) 

As  a  result  of  the  properties  of  Jn  already  described,  only  the  first  r  components 
of  y^  will  be  of  significant  magnitude;  thus,  u^  lies  in  a  subspace  of  dimension  r 
and  only  r  consecutive  u’s  can  be  linearly  independent.  Actually,  this  is  an 
idealized  situation;  practically,  the  distinction  between  rank  r  and  rank  r  +  1  may 
be  very  fuzzy  indeed.  Ways  of  dealing  with  this  troublesome  matter  will  be 
discussed  later  in  this  report. 


BASIC  EQUATIONS 

In  the  full  rank  extrapolation  discussed  in  Section  3,  it  was  shown  that  the 

A. 

difference  between  the  exact  solution,  x,  and  some  initial  vector,  x^,  is  a  linear 
combination  of  the  m  columns  of  U.  These,  being  linearly  independent  by  hypothesis, 
span  the  whole  ra-dimensional  vector  space*  This  simple  property  of  full  rank 
extrapolation  is  carried  over  as  closely  as  possible  to  reduced  rank  extrapolation: 
the  difference  between  the  extrapolated  vector,  It,  and  the  initial  vector,  x^,  is  a 
linear  combination  of  the  r  columns  of  U  which,  by  assumption,  are  linearly  inde¬ 
pendent.  Thus,  Equation  (3.3)  is  retained 

x  =  xQ  +  U  £  (4.6) 

with  the  change  that  the  vector  £  now  has  only  r  components  (l^rAn) ,  and  the  left 
side  is  merely  an  extrapolated  vector.  It,  not  the  exact  solution,  x.  However,  the 
companion  Equation  (3.4)  is  no  longer  available,  as  it  stands,  for  the  determination 
of  £  because  (I-A)  can  no  longer  be  completely  determined  from  H  «  (A-l)  U  as  it 
was  in  full  rank  extrapolation.  Instead,  a  new  vector, 

R  =  uQ  +  H  £  (4.7) 

called  the  residual  vector,  is  introduced,  and  £  is  chosen  to  minimize  some  norm 
of  R.  Before  the  latter  point  is  elaborated,  it  will  be  shown  that  R  has  a 
natural  and  important  interpretation  as  the  first  difference,  TT,  obtained  by 
substituting  the  extrapolated  vector,  x",  back  into  the  basic  iteration  of  Equation 
(1.1): 

R  “  uQ  +  H  £  "  (Ax0+b-xQ)  +  (A-I)  U  £ 

=  (A-I)  (x0+U£)  +  b  *  (Ax+b)  -  1c  *  TT  (4.8) 

Furthermore,  R  is  intimately  related  to  the  error,  IT  -  x,  of  the  extrapolated  vector, 
just  as  Uq  is  related  to  the  initial  error,  -  x.  As  for  the  latter,  merely 
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rearranging  Equation  (3.1)  gives 


u0  =  (A- I)  (x0-x) 


(4.9) 


Then  it  follows  easily  that 

R=u=u0+H£=  (A- I)  (Xp-x)  +  (A-I)  U  £ 

=  (A-I)  (Xp-x+x-Xp) 

=  (A-I)  (x^x)  (4.10) 

Both  of  these  equations  correspond  to  Equation  (2.9). 

VECTOR  NORMS  AND  THE  DETERMINATION  OF  £ 

AND  r 

Under  the  conditions  assumed  in  this  section  it  is  no  longer  possible  to  make 
the  residual  vector,  R,  vanish  as  it  did  in  the  full  rank  case.  The  best  that  can 
be  done  is  to  make  it  as  "small"  as  possible  by  choosing  the  vector  £  so  as  to 
minimize  some  norm  of  R  and  make  this  value  less  than  some  prespecified  tolerance: 

| |R| |  =  min  -  e  (4.11) 


There  are  three  standard  vector  norms,  &2>  anc*  which  can  he  used  to 
measure  the  magnitude  of  a  vector.  Corresponding  to  each  of  these  vector  norms 
is  a  matrix  norm  which  is  "consistent"  with  it.  This  property  of  consistency,  when, 
applied  to  Equation  (4.10),  asserts  that 
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(4.13) 


(4.14) 


m 

NMIL=max  £  lM^  1  (4.15) 

1  j-1 

Among  these  norms,  (better  known  as  the  Euclidean  norm)  is  by  far  the  best 
known  and  widely  used.  Optimization  using  it  is  the  method  of  least  squares 
introduced  by  Legendre  and  by  Gauss  around  1800.  The  vector  £  is  determined  by 
solving  a  system  of  linear  equations  of  order  r,  called  the  normal  equation. 

The  next  best  known  norm  is  the  A  norm.  Optimization  using  it  is  known  as 
Chebyshev  or  miniznax  approximation.  The  vector  £  is  determined  as  the  solution  of 
a  linear  programming  problem. 

The  least  well  known  of  these  norms  is  the  norm,  which  seems  to  have  no 
other  name.  Optimization  using  it  is  sometimes  referred  to  as  the  method  of  least 
absolute  deviations.  The  vector  £  is  determined  as  the  solution  of  a  linear 
programming  problem  in  this  case  also. 
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Thus  far  the  three  vector  norms  an<*  ^  have  been  treated  on  an  equal 

footing.  Actually,  all  of  the  author’s  computational  experimentation  has  been  with 
the  ^  norm,  simply  because  it  was  more  familiar  and  hence  easier  to  adapt  to 
present  needs.  Similar  adaptation  of  linear  programming  methods  encountered  in 
optimization  under  and  norms  should  be  carried  out.  For  a  sample  of  fairly 
recent  work  in  and  optimization,  see  the  several  papers  by  Barrodale 
et  al.^  and  by  Bartels  et  al.^*  ^ 

COMPUTATIONAL  TACTICS 

Determination  of  the  rank,  r,  of  the  system 

R  -  uQ  +  H  £  (4.7) 

|  | R |  |  *  tnin  —  e  (4.11) 

is  a  key  step  in  the  solution  of  the  extrapolation  problem.  Considerable  compu¬ 
tational  experimentation  has  convinced  the  author  that  the  best  approach  is  to  be¬ 
gin  by  letting  the  basic  iteration,  Equation  (1.1),  run  ("initial  skip")  until 
contributions  associated  with  smaller  eigenvalues  have  been  suppressed  as  described 
earlier  under  "Definition  and  Justification."  At  this  stage  the  effective  rank  of 
H  will  be  some  not-too- large  integer.  Then  begin  a  "build  up"  cycle,  i.e.,  carry 
out  a  step  by  step  procedure  in  which  the  output  of  each  new  iteration  of  Equation 
(1.1)  is  used  to  build  a  new  column  of  each  of  the  difference  matrices  U  and  H. 

As  each  new  column  is  added,  the  vector  £  is  determined  so  as  to  minimize  whatever 
norm  of  R  is  being  used.  If  the  resulting  minimized  norm  is  less  than  some 
preassigned  tolerance,  e,  then  the  solution  is  at  hand  and  the  extrapolated  solution 
vector  is  obtained  from 


"x  =  Xq  +  U  £  (4.6) 

using  the  current  x^,  U,  and  £.  The  number  of  columns  of  U  and  of  H  is  r,  the  rank 
of  the  system.  If  the  tolerance  is  not  met,  the  iteration  cycle  is  repeated  until 
the  number  of  columns  reaches  some  maximum  number,  N,  determined  in  advance  by 
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availability  of  storage  for  U  and  H.  Then  a  new  build-up  cycle  can  be  started 
either  immediately  or  after  skipping  through  some  preassigned  number  (NSK)  of  basic 
iterations  of  Equation  (1.1). 

It  is  a  matter  of  tactical  doctrine  whether  or  not  to  feed  back,  as  a  new 
starting  vector,  an  extrapolated  vector  jc  for  which  the  tolerance  on  the  norm  of  R 
has  not  been  met.  Computational  evidence,  which  appears  in  Section  5,  suggests 
that  this  is  a  minor  question,  provided  that  a  sufficiently  large  maximum  rank  is 
allowed  for  (N>2)  and  that  the  initial  skip  is  sufficient  for  the  effective  rank  of 
H  to  have  become  —  N  as  required  by  hypothesis.  The  very  worst  performance  arises 
from  forced  feedback  of  rank  one  extrapolation  right  from  the  beginning  (N=l,  ISK=0) . 

From  the  foregoing  discussion  it  should  be  clear  that  what  has  been  called  the 
rank  of  the  system  (number  of  columns  of  U  and  H,  number  of  components  of  £)  depends 
not  only  on  the  course  of  the  basic  iteration  of  Equation  (1.1),  i.e.,  how  rapidly 
it  converges,  but  also  on  the  tolerance  e  xch  is  imposed.  An  analogous  situation, 
wherein  the  solution  of  a  small  system  of  linear  equations  depends  strongly  on  the 
tolerance  imposed  on  the  residuals,  was  discussed  by  Peters  and  Wilkinson  (1970), ^ 
(especially,  pages  314-315). 

LEAST  SQUARES  OPTIMIZATION  STANDARD  THEORY 
PLUS  MODIFICATIONS 

The  rest  of  this  section  will  be  devoted  to  optimization,  the  well  known 
method  of  least  squares.  The  vector  £  in  the  optimization  problem  given  by 
Equations  (4.7),  (4.11)  is  found  as  the  solution  of  the  so-called  normal  equations 

H*  uQ  +  H*H  Z  =  0  (4.16) 

and  is  given  by 

C  =  -  H+  uQ  (4.17) 

where 

H+  ■  (H*H)-1  H*  (4.18) 
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is  the  well  known  Moore— Penrose  generalized  inverse  of  H.  (General  references: 
Ben-Israel  and  Greville  (1974),^  Boullion  and  Odell  (1971)  and  Lawson  and 
Hanson  (1974). 18 

In  view  of  the  tentative  step-by-step  procedure  advocated  in  the  preceding 

section,  one  might  be  inclined  to  use  a  step-by-step  expansion  of  H+  using 

19 

partitioned  matrices  as  was  so  carefully  discussed  by  Greville  (1960).  However, 
from  the  standpoint  of  programming  the  solution  on  a  computer,  it  is  much  simpler 
and  more  elegant  to  use  what  may  be  called  the  expanding  Choleski  method.  This 
method  involves  the  triangular  square  root,  T,  of  the  symmetric  positive  definite 
matrix  H*H.  From 


T*T  =  H*H 


(4.19) 


it  follows  that 


Tn  • 


Tij  ■  <H*HVT11 


i-1 


Tu  = 


-2T 


Thi  Thj  //Tii 


h=l 


i-1 


Tii  ii 


■z- 


hi 


h=l 


(4.20) 


The  process  of  constructing  T  produces  from  Equation  (4.16)  the  triangular  system 

T  K  +  T*"1  H*  uQ  “  0  (4.21) 


which  la  easily  solved  for  £. 
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The  expanding  nature  of  the  computation  carries  over  very  neatly  into  the 
coding  as  follows:  as  each  new  vector  (beyond  Xq,x^,X2)  is  produced  by  the  basic 
iteration,  a  new  column  is  added  to  the  difference  matrices  U  and  H.  Corresponding¬ 
ly,  a  new  row  and  a  new  column  are  added  to  H*H  and  a  new  element  (new  row)  is 

added  to  H*  un.  The  lower  triangular  matrix  T  gains  a  new  column  and  the  vector 
*_1  0 

T  H*  Uq  in  Equation  (4.21)  gains  one  more  element.  Once  any  element  of  the  array 
holding  the  system  of  Equation  (4.16)  has  been  used  in  computation,  it  is  never 
needed  again;  hence  the  same  array  can  be  overlayed  by  the  system  of  Equation 

2 

(4.21).  The  expansion  process  may  have  to  be  halted  prematurely  if  some  (T^) 
becomes  either  too  small  in  relation  to  overall  accuracy  of  computation  or  actually 
negative.  In  such  a  case  it  is  best  to  just  abort  the  current  extrapolation  cycle 
and  return  control  to  the  executive  routine. 

Geometric  Sidelights 

Use  of  the  norm  leads  to  especially  simple  relations  between  the  residual 
vector  R  =  XT',  Equation  (4.7),  and  u^  and  also  between  the  errors  of  the  extrapolated 
vector  3T  and  of  the  initial  vector  x^.  Substituting  Equation  (4.17)  into  Equation 
(4.8)  gives 

R='u=u0+HC=  (I-HH+)  uq  (4. 22) 

The  quantity  (I-HH+)  is  called  an  orthogonal  projector:  it  eliminates  any 
components  of  which  lie  in  the  subspace  spanned  by  the  columns  of  H. 

Similarly,  Equations  (4.9)  and  (4.10)  with  Equation  (4.22)  yield 

x  -  x  =  (A- 1)  *  TT  «*  (A-I)  (I-HH+)  (A- 1)  (Xq-x) 

=  (I-UU+)  (x0-x)  (4.23) 

The  last  step  Involves  the  basic  relation 

H  =  (A-I)  U  (2.6) 
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. 
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plus  a  theorem  of  Greville  (1966)  which  justifies  assuming  that 

H+  -  U+  (A-I)-1  (A. 24) 

Here  again,  in  Equation  (4.23),  is  an  orthogonal  projector  which  eliminates  any 
components  of  the  initial  error  vector  -  x  which  lie  in  the  subspace  spanned  by 
the  columns  of  U. 

Obviously,  these  projectors  are  related  by  a  similarity  transformation: 

I  -  HH+  +  (A-I)  (I-UU+)  (A-I)"1  (4.25) 


Special  Cases:  N  =  1  and  2 

The  simplest  possible  cases  of  the  foregoing  theory  occur  at  the  start  of  an 
extrapolation  cycle  when  the  vector  difference  matrices  U  and  H  have  only  one  or  two 
columns  (N=l  or  2).  Explicit  solutions  are  easily  given,  at  least  for  the  %  norm: 


N  =  1  l  -  -  h*u0/h£h0  (scalar)  (4.26) 

it  =  xQ  +  c  =  (1-0  x0  +  £  (4.27) 

Thus  the  rank  one  extrapolation  is  merely  the  well  known  relaxation  process  with  a 
ready-made  value  of  the  relaxation  parameter.  Corresponding  expressions  for  N  -  2 
are 


x  s  (1— ^^)  Xq  ^2  x2 


(4.28) 


where 


(h*hl)  (hSu0)  -  (h*h0)  (h*u0) 
n  *  “  (h*h2)  (hgh0)  -  (hjh0)  (h*h2) 


(4.29) 


As  might  be  expected,  these  simplest  cases  have  already  appeared  in  the 
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literature.  For  example,  Jennings  (1971)  gave  the  equivalent  of  Equation  (4.26) 

and  called  it  SDM  (second  difference  modulation).  He  also  discussed  what  he  called 

"double  acceleration"  which  appears  to  be  closely  related  to  the  N  =  2  case  of  the 
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present  report,  as  did  Hadjidimos  (1978). 

Earlier  Version:  Singular  Value 
Decomposition 

For  the  sake  of  completeness,  it  is  desirable  to  describe  briefly  an  earlier 

version  of  reduced  rank  extrapolation  which  has  already  appeared  in  the  literature 
23 

(Eddy  (1979))  and  to  explain  why  it  has  been  discarded. 

The  original  train  of  thought,  as  described  at  the  beginning  of  Section  4, 
led  to  attempts  to  determine  "the  rank"  of  the  matrix  H  of  second  difference  vectors 
and  then  to  solve  the  normal  Equations  (4.16)  having  this  dimension.  The  most 
powerful  tool  for  such  rank  determination  and  for  solving  the  linear  least  squares 
problem  is  the  "singular  value  decomposition" 

H  =  W  D  V*  (4.30) 

where  D  is  a  diagonal  matrix  of  (nonnegative)  eigenvalues  of  H*H  and  V  and  W  are 

orthogonal  matrices.  (General  references:  Ben-Israel  and  Greville  (1974),^ 
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Stewart  (1973)).  The  computational  aspects  of  singular  value  decomposition  were 

perfected  by  Golub  and  various  collaborators  in  a  series  of  papers  during  the  1960fs, 
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(e.g.,  Golub  and  Reinsch  (1970))  culminating  in  a  state-of-the-art  computer 

program,  CSVD,  (in  FORTRAN)  for  general  complex  matrices  (Businger  and  Golub 
26 

(1969)).  This  program  was  adapted  by  the  present  author  to  the  simpler  case  of 

real  arithmetic  only  (Computer  Program  RSVD)  and  used  for  computational  experimenta- 

23 

tion  as  reported  in  Eddy  (1979). 

The  dismaying  fact  which  emerged  from  monitoring  printouts  was  that,  at  least 
for  the  typical  sample  problem  described  in  Section  5,  there  was  no  clear-cut  break 
between  "large"  and  "small"  singular  values  of  H,  analogous  to  assumptions  explained 
at  the  beginning  of  this  section,  and  hence  no  clear-cut  rank  for  H. 
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Correspondingly,  it  became  clear  that  linear  independence  of  the  vectors 
encountered  here  is  a  rather  "fuzzy"  concept:  the  number  of  successive  vectors 
which  can  be  considered  to  be  linearly  independent  depends  very  much  on  the 
tolerance  imposed. 

Another  fact  emerged  serendipitously  from  the  copious  printouts,  something 
that  had  not  even  been  conjectured  previously,  namely  that 


R  =  u 


(4.8) 


Once  conjectured,  this  equality  was  easily  proved  and  became  the  cornerstone  of  the 
revised  approach  described  herein. 


5.  COMPUTATIONAL  EXPERIMENTS 


SAMPLE  PROBLEM 


The  sample  problem  used  to  exercise  the  comnuter  program  and  to  compare  the 
results  of  various  tactical  options  therein  is  tne  finite  difference  solution  of 
the  Laplace  equation,  in  both  a  square  (two  dimensions)  and  a  cube  (three 
dimensions)  with  zero  boundary  conditions.  Ten  internal  mesh  points  are  taken  in 
each  direction,  and  mesh  point  starting  values  are  obtained  from  a  pseudo-random 
number  generator.  This  generator  is  a  function  subroutine  built  into  the  CDC 
FORTRAN  compiler.  It  produces  the  same  sequence  of  "random"  numbers  on  each 
problem  run.  This  simple  problem  is  intended  to  be  sufficiently  realistic  but  at 
the  same  time  to  be  easy  to  code. 

COMPUTER  PROGRAM 

The  computer  program  is  intended  to  simulate  some  existing  computer  program  for 
vector  iteration  into  which  an  extrapolation  capability  has  been  inserted.  Further¬ 
more,  it  is  also  intended  that  the  program  be  easy  to  use  for  numerical  experiments 
so  certain  tactical  options  are  made  to  depend  upon  the  values  of  input  parameters. 

In  shifting  between  two-  and  three-dimensional  problems,  it  is  necessary  to 
replace  the  ordinary  iteration  subroutine,  to  change  most  of  the  dimension  state¬ 
ments,  and  to  change  the  format  statement  controlling  the  page  heading  for  output 
printouts. 
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The  program  is  organized  into  three  modules  for  which  listings  are  given  in 
the  appendix: 

(1)  the  executive  routine,  SIMP,  which  controls  initial  setup,  controls 
printouts  for  monitoring,  calls  the  ordinary  iteration  subroutine,  computes  norms 
of  difference  vector  and  tests  for  convergence,  and  calls  the  extrapolation 
subroutine. 

(2)  the  ordinary  iteration  subroutine,  P2D  or  P3D,  depending  on  whether  a  two- 
or  a  three-dimensional  problem  is  at  hand.  In  either  case  the  new  value  at  a  mesh 
point  is  just  the  mean  of  the  values  at  nearest  neighbor  points.  This  averaging  is 
under  control  of  an  input  parameter,  MON:  for  MON  =  0  only  old  mesh  values 
(components  of  XOLD)  are  used  in  computing  new  values  (gesamtschritt  =  total  step  = 
point  Jacobi  iteration) ;  for  MON  =  1  new  mesh  point  values  (components  of  XNEW)  are 
fed  in  as  soon  as  they  are  available  (einzelschritt  =  single  step  =  Gauss-Seidel 
iteration).  No  effort  was  made  to  optimize  these  subroutines  in  any  sense. 

(3)  the  extrapolation  subroutine,  XTRP,  which  builds  up  the  difference  matrices 
DIF1  (=U)  and  DIF2  (=H) ,  stores  the  vector  x^  at  the  beginning  of  a  build-up  cycle 
in  XHLD,  builds  up  the  normal  equations  and  their  triangularized  version  (both  in 
the  array  HUTF),  calculates  the  residual  vector  RSID  and  its  norms  under 

and  &  ,  tests  whether  a  selected  one  of  these  norms  meets  the  specified  tolerance 
TOL  (=e) ,  and,  if  so,  computes  the  extrapolated  vector  IT  which  then  replaces  the 
least  iteration  output  in  XNEW.  This  >T  is  then  fed  back  into  the  ordinary 
iteration;  since  iT  =  R,  the  convergence  test  there  is  met  and  computation  is  stoppe:. 

There  are  several  more  input  parameters: 

M  =  dimension  of  original  system  of  equations  =  number  of  components  in 
most  vectors.  For  the  present  sample  problem,  M  =  NR  x  NC  for  2D  problems, 

M  =  NR  x  NC  *  NL  for  3D  problems  where 

NR  =  number  of  rows 

NC  =  number  of  columns 

NL  -  number  of  layers 

in  the  rectangular  array  of  mesh  points.  All  were  equal  to  ten  in  the  sample 
problems, 

N  =  maximum  number  of  columns  allowed  for  in  DIF2  (-H)  *  maximum  rank 
extrapolation  allowed  for.  This  is  also  the  maximum  number  of  components  in  the 
auxiliary  vector  XKSI  and  the  maximum  number  of  rows  in  the  normal  equation  array 
HUTF.  Values  of  N  from  one  to  ten  were  tried. 
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MXT  *  max  i  nun  [hb  is  a  safety 

feature  to  prevent  r  %  +  - 

ISK  *  initial  •  *  i4  --n*.  -kipped  through  before  an 

extrapolation  (mild- up  * 

NSK  =  number  ot  -r>!  -  <  ?  *>  ; .  ;  » •  > :  v-roug?.  between  the  end  of  one 

extrapolation  build-up  *  vl  U*  and  t  fa  winning  of  the  next. 

MFB  controls  forced  feedback  into  r  tie  iteration  cycle  of  an  extrapolated 
vector  x  for  which  the  convergence  criterion  R  -  e  has  not  been  met. 

(No  feedback  for  MFB=0,  forced  feedback  for  MFB=1). 

Dimensions  of  all  arrays  are  specified  in  terras  of  the  two  parameters  M  and  N. 
Vectors  XOLD,  XNEW,  XHLD,  and  RSID  have  dimension  M.  Vector  XKSI  has  dimension  N. 
The  two-dimensional  arrays  are  DIF1  (M,N+1) ,  DIF2  (M,N) ,  and  HUTF  (N,N+1). 

The  amount  of  storage  required  for  these  arrays  is 

S  =  (5+2N)  M  +  N  (N+2)  (5.1) 

For  M  of  the  order  of  thousands  or  tens  of  thousands,  as  would  be  the  case  for  a 
realistic  problem  in  partial  differential  equations,  this  amount  of  extra  storage 
becomes  prohibitively  large  for  present  day  computer  systems,  even  for  small  values 
of  N.  (N  -  5  or  6  seems  to  be  optimal  for  the  prr  ent  sample  problem.) 

SUMMARY  OF  RESULTS 

Tables  1-3  show  computation  time  in  seconds  on  a  CDC  6400  and  iteration  count 
as  functions  of  two  parameters:  N,  the  maximum  rank  extrapolation,  and  ISK,  the 
numbers  of  iterations  skipped  through  before  beginning  the  first  extrapolation 
cycle.  No  complete  survey  was  attempted — only  enough  to  locate  the  parameter 
values  which  yielded  minimum  computation  times. 

Three  closely  related  problems  were  treated:  the  finite  difference  equivalents 
of  Laplace's  equation  in 

(1)  2  dimensions,  10  *  10  grid,  5-point  pattern 

(2)  2  dimensions,  10  *  10  grid,  9-point  pattern 

(3)  3  dimensions,  10  x  10  x  10  grid,  7-point  pattern 

For  each  of  these  problems  there  were  four  combinations  of  tactical  options,  not  all 
of  which  were  investigated  for  each  problem.  They  were 
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TABLE  1  (Continued) 

TABLE  IB  -  M0N=0,  MFB=1 

2 

3 

6.469 

4.721 

188 

122 

5 

6 

3.628 

3.237 

78 

66 

8 

9 

10 

3.532 

3.412 

3.210 

62 

57 

51 

10.647  2.947 


11.922  2.211 


7.645  1.577 


285 


74 


1.512 


74 


3.922  1.464 


76 


1.291 


74 


2.903 


93 


1.915 


74 


1.748 


76 


1.564 


74 


1.366 


72 


1.406 


76 


2.291 

1.931 

1.852 

1.653 

1.584 

1.387 

1.434 

70 

59 

59 

49 

47 

43 

43 

1.995 

1.619 

1.487 

1.287 

1.377 

1.337 

1.506 

66 

55 

51 

46 

47 

46 

48 

1.801 

1.421 

1.199 

1.226 

1.313 

1.376 

1.372 

65 

54 

49 

49 

50 

51 

51 

1.608  1.213  1.158  1.192 


1.476  1.165  1.188  1.216 


1.445  1.144  1.271  1.340 


1.304  1.203  1.281 


1.262  1.231  1.229 


69 


1.297 


73 


ABLE  1  (C 
ID  -  M 


4.052  2.033  1.931  2.021  1.895  1.983  2.224  2.144  2.113  2.010 


5 

6 

1.895 

1.983 

41 

40 

1.534 

1.675 

36 

38 

1.128 

1.256 

32 

32 

0.838 

0.783 

28 

27 

0.655 

0.707 

28* 

28 

0.717 

0.713 

32 

32 

0.667 

0.658 

35 

35 

2.357  1.116  1.005  0.867  0.838  0.783  0.722  0.799  0.805  0.780 


TABLE  1  (Continued) 

TABLE  IE  -  SUMMARY  FOR  2D,  5-POINT  PATTERN 
OPTIMAL  VALUES  OF  PARAMETERS 


Free  Run 

Extrapolation 

Iteration 

No  Forced 

Forced 

Input 

Feedback 

Feedback 

Vector 

MFB-O 

MFB-l 

XOLD  ONLY 

Time  =  4.803 

Min  Time  =  1.211 

Min  Time  =  1.144 

NIT  =  319 

NIT  =  67 

NIT  =  59 

MON=0 

ISK  =  60 

ISK  =  50 

N  =  6 

N  =  5 

Time  Gain  =  3.97 

Time  Gain  *  4.20 

NIT  Gain  -  4.76 

NIT  Gain  =5.41 

XOLD/XNEW 

Time  -  1.943 

Min  Time  -  0.662 

Min  Time  =  0.655 

(as  available) 

NIT  =  128 

NIT  -  35 

NIT  =  28 

MON-1 

ISK  -  30 

ISK  =  20 

N  =  4 

N  =  5 

Time  Gain  =  2.94 

Time  Gain  =2.97 

NIT  Gain  -  3.66 

NIT  Gain  =4.57 
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TABLE  2  (Continued) 
TABLE  2B  -  MON=0,  MFB=1 


TABLE  2  (Continued) 
TABLE  2C  -  MON=l,  MFB=0 


TABLE  2  (Continued) 


TABLE  2D  -  SUMMARY  FOR  2D,  9-POINT  PATTERN 
OPTIMAL  VALUES  OF  PARAMETERS 


Free  Run 

Extrapolation 

Iteration 

Input 

Vector 

No  Forced 

Feedback 

MFB=0 

Forced 

Feedback 

MFB=1 

XOLD  ONLY 

MON=0 

Time  =  3.393 

NIT  =  165 

Min  Time  =  1.071 

NIT  =  36 

ISK  =  30 

N  =  5 

Time  Gain  =  3.17 

NIT  Gain  =4.58 

Min  Time  «  1.064 

NIT  »  36 

ISK  =  30 

N  -  4 

Time  Gain  «  3.19 

NIT  Gain  =4.58 

XOLD/XNEW 

(as  available) 

MON=l 

Time  *  1.899 

NIT  -  90 

Min  Time  =  0.790 

NIT  «  25 

ISK  =  20 

N  =  3 

Time  Gain  =  2.41 

NIT  Gain  =3.60 

(Not  Done) 

28 


TABLE  3  -  COMPUTATION  TIMES  AND  ITERATION  COUNTS  3D,  7-POINT  PATTERN 

TABLE  3A  -  MON=0,  MFB=1 


V  N 

isk\ 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

0 

52.340 

44.368 

40.207 

38.889 

38.763 

36.694 

37.737 

129 

100 

84 

76 

71 

63 

60 

« 

♦ 

40 

20.163 

17.370 

17.506 

18.201 

67 

61 

59 

60  1 

50 

22.804 

18.206 

15.399 

15.867 

15.956 

16.116 

86 

70 

64* 

65 

65 

66 

60 

16.302 

15.957 

16.763 

17.399 

17.406 

73 

72 

73 

74 

74 

. 

» 

♦ 

100 

20.331 

20.057 

20.051 

109 

107 

107 

TABLE  3  (Continued) 
TABLE  3B  -  M0N=1,  MFB=0 


TABLE  3  (Continued) 


! 


TABLE  3C  -  SUMMARY  FOR  3D,  7-POINT  PATTERN 
OPTIMAL  VALUES  OF  PARAMETERS 


Free 

Run 

Extrapolation 

Iteration 

Input 

Vector 

No  Forced 

Feedback 

MFB=0 

Forced 

Feedback 

MBF=1 

XOLD  ONLY 

MON-O 

Time  = 

NIT  = 

50.092 

297 

(Not  Done) 

Min  Time  =  15.399 

NIT  =  64 

ISK  =  50 

N  =  7 

Time  Gain  =  3.25 

NIT  Gain  =4.64 

XOLD/XNEW 

Time  = 

23.621 

Min  Time  =  8.391 

(as  available) 

MON=l 

NIT  = 

134 

NIT  =  37 

ISK  =  30 

N  =  5 

Time  Gain  =  2.82 

NIT  Gain  =3.62 

(Not  Done) 

MFB  =  0  or  Is  extrapolated  vector,  for  which  convergence  criterion  was  not 
met,  was  not/was  fed  back  as  input  into  next  iteration  cycle 

MON  =  0  or  1:  input  to  iteration  is  output  of  previous  cycle/components  of 
new  output  vector  are  fed  in  as  generated. 

For  each  combination  of  tactical  options  there  is  one  table;  for  each  problem  there 
are  up  to  four  such  tables  plus  a  summary  table  showing  the  minimum  computation 
time  and  corresponding  iteration  count  as  well  as  the  gains  achieved  in  comparison 
with  free  runs  in  which  no  extrapolation  was  attempted.  (Time  gain,  for  example, 
is  the  ratio  of  free  run  time  to  least  time  attained  using  extrapolation.) 

In  all  cases  the  components  of  the  starting  vector  were  (the  same  sequence  of) 

pseudorandom  numbers  in  the  range  -0.01  —  x.  -  +0.01  and  the  criterion  for  con- 

—8  * 

vergence  was  based  on  £  5  TOL  =  10 

For  the  problems  considered,  the  optimum  value  of  N  lay  in  the  range  3  --  N  --  7, 
with  N  =  5  a  good  compromise.  The  procedure  for  determining  this  optimum  N  is  to 
make  a  series  of  runs  N  *  1,  2,  3,  all  with  ISK  =  0.  That  value  of  N  for 

which  computation  time  is  either  a  minimum  or  nearly  down  to  an  asymptotic  value 
is  either  the  desired  N  or  adjacent  to  it.  The  number  of  iterations  corresponding 
to  this  N,  for  ISK  =  0,  is  somewhat  above  the  value  of  ISK  which  yields  minimum 
time,  but  it  gives  a  starting  point  from  which  to  search  for  the  optimal  value. 

The  best  value  of  ISK  is  one  that  requires  only  one  or  two  extrapolation  cycles  to 
obtain  convergence,  because  the  extra  computation  involved  in  extrapolation  is 
relatively  costly  of  time. 

One  interesting  sidelight  on  tactical  doctrine  appears:  when  the  optimal 
values  of  N  and  ISK  are  used,  it  makes  relatively  little  difference  whether  MFB  *  1 
or  0,  i.e.,  whether  an  extrapolated  vector  IT,  for  which  the  convergence  criterion 
has  not  been  met,  is  or  is  not  fed  back  as  input  to  the  next  iteration. 

By  contrast,  it  stands  out  starkly  that  the  very  worst  tactic  is  to  force  back 
rank  one  extrapolation  (N*l)  right  from  the  start  (ISK-0) .  This  is  easy  to  under¬ 
stand:  the  conditions  under  which  rank  one  extrapolation  is  effective  have  not 
nearly  been  reached  so  that  what  actually  takes  place  Is  merely  a  succession  of 
restarts. 
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6.  OTHER  WORK  ON  EXTRAPOLATION  OF  VECTOR  SEQUENCES 

The  best  known  and  most  influential  work  on  the  extrapolation  of  vector 

27 

sequences  is  the  vector  epsilon  algorithm  of  Peter  Wynn  (1962),  It  was  put  forth 

as  a  generalization,  by  reinterpretation  of  his  elegant  scalar  epsilon  algorithm 

28 

which  he  had  proposed  (Wynn  (1956))  as  a  practical  means  of  computing  Shanks* 

transforms  (Shanks  (1955)).^  Roughly  speaking,  if  the  zeroth  column  of  the  vector 

epsilon  array  contains  the  successive  vectors  from  the  iterative  solution  of  a 

system  of  m  linear  equations,  then  each  element  of  the  2mt*1  column  is  the  solution 

vector.  The  exact  theorem,  taking  into  consideration  the  nature  of  the  minimal 

polynomial  of  the  iteration  matrix,  was  discovered  by  Brezinski  in  1972  (see 
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Brezinski  and  Rieu  (1974))  and  by  Gekeler  (1972).  For  a  summary  of  results  in 

31 

this  general  area,  see  Brezinski* s  lecture  notes  (Brezinski  (1977)).  His  FORTRAN 
implementation  of  the  vector  epsilon  algorithm,  EPSV,  plus  discussion  and  worked 
examples,  appears  in  his  textbook  on  algorithms  for  accelerating  convergence, 
Brezinski  (1978) . ^ 

Unfortunately,  the  availability  of  this  very  useful  product  of  mathematicians 

seems  to  be  too  little  known  to  engineers.  Acceleration  of  convergence  could 

result  in  appreciable  savings  in  the  cost  of  their  design  calculations.  Brezinski *s 
32 

text  is  intended  to  help  spread  the  word;  so  far  it  is  available  only  in  French. 

Just  as  in  the  1950* s,  when  the  needs  of  the  nuclear  energy  industry  provided 
impetus  for  developing  the  so-called  relaxation  methods  for  the  iterative  solution 
of  large  systems  of  linear  equations,  so  in  the  1970’ s  the  needs  of  the  aerospace 
industry,  particularly  in  solving  problems  in  transonic  flow,  spurred  several 
attempts  to  devise  methods  for  accelerating  convergence  in  the  iterative  solution 
of  large  linear  systems. 
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In  one  series  of  papers,  Martin  and  Lomax  (1975),  and  Martin  (1975)  and 
35 

(1976),  of  NASA  Ames  Laboratory,  the  authors  presented  an  ingenious  method 
involving  attaching  to  a  power  series  the  finite  difference  solutions  of  certain 
subsidiary  problems  arising  from  a  perturbation  analysis  of  the  original  partial 
differential  equations.  (The  motivation  was  their  understanding  that  Shanks  had 
shown  that  a  power  series  (even  the  first  few  terms!)  tends  to  behave  like  a 
geometric  series  and  hence  can  be  summed  by  what  they  refer  to  as  the  ’’Aitken/ 

Shanks  transform,*'  (Equation  (1.2)  of  this  report.  Thus  they  were  led  to 
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mesh-pointwise  extrapolation  of  the  first  three  perturbation  functions  as  a 

means  of  accelerating  convergence.  For  elaboration  of  these  comments,  see 
3  6 

Eddy  (1976).  Gains  in  computation  time  by  a  factor  somewhat  less  than  two 
are  reported. 
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In  another  series  of  papers,  Hafez  and  Cheng  (1975),  (1976),  (1977)  of 

the  University  of  Southern  California  at  Los  Angeles  present  the  application  to 
certain  aerodynamic  problems  of  their  modification  of  Shanks1  transforms  (or 
equivalently,  of  Wynn’s  vector  epsilon  algorithm).  Their  first  order  transform 
handles  the  situation  wherein  the  matrix  of  the  linear  approximation  to  the 
iteration  equation  has  a  single  dominant  eigenvalue,  the  second  order  transform 
copes  with  two  dominant  eigenvalues,  and  so  on.  They  achieve  a  considerable 
reduction  in  required  storage  space  by  virtue  of  their  theorem  that,  if  the 
initial  sequence  of  vectors  is  generated  by  a  linear  recursion,  then  so  also  is 
every  even-ordered  column  of  the  corresponding  vector  epsilon  array.  They  claim  a 
reduction  in  computation  time  by  a  factor  ranging  from  three  to  five. 

Many  papers  have  been  written,  by  many  authors,  on  the  choice  of  the  relaxation 
parameter,  oj,  in 


x  =  (1-u))  x  +  o)  x  (6.1) 

n  n+1 

As  has  already  been  pointed  out  in  Section  4,  rank  one  extrapolation  of  the  present 

paper  gives  a  particular  way  of  calculating  this  parameter  (Equations  (4.26), 

(4.27)).  Detailed  comparisons  of  the  relative  effectiveness  of  this  and  other 

prescriptions  have  not  been  made  and  are  not  planned.  Rank  one  extrapolation  is 

relatively  ineffective  when  compared  to  higher  rank  extrapolation,  and  much  more 

interest  attaches  to  possible  relationships  between  rank  r  extrapolation  for  r  _>  2 

and  (a)  the  successive  even-numbered  columns  of  Wynn's  vector  epsilon  array,  or 

(b)  the  higher  order  transforms  of  Hafez  and  Cheng.  These  questions  remain  to  be 

investigated.  As  has  already  been  mentioned,  something  close  to  rank  two 
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extrapolation  has  been  presented  by  Jennings  (1971)  and  by  Hadjidimos  (1978). 


After  the  manuscript  of  this  report  had  been  completed,  (except  for  the 
remainder  of  this  section),  Claude  Brezinski  pointed  out  to  the  author  that  the 

ideas  described  herein  are  close  to  those  contained  in  papers  by  Cabay  and  Jackson 

40  v  41  42  A3 

(1976),  Mesina  (1977),  Smith  and  Ford  (1980),  and  Skelboe.  The  main  idea 

40 

of  the  so-called  polynomial  method  of  Cabay  and  Jackson  can  be  sketched  as 
follows,  using  the  notation  of  the  present  paper: 

Corresponding  to  the  determination  of  the  vector  £  (Equations  (4,6),  (4.7))  is 
the  determination  of  a  pseudo  minimal  polynomial 


such  that 


s+1 

P(t)  =  £  Pj  t8  (pg+1=l,  p(l)  ^0) 

j=0 


I  |p(A>u0l I 


Then  the  extrapolated  vector  is 


s 


i=0 


(6.2) 


(6.3) 


(6.4) 


where 


c 


i 


(6.5) 


In  spite  of  differences  in  approach  and  in  mathematical  machinery  employed,  the 
polynomial  method  and  the  reduced  rank  method  are  indeed  quite  parallel  and  share 
the  same  computational  pitfalls. 
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(Uj-b)  +  b 


(6.8) 


Here  again  the  problem  is  to  determine  the  rank,  m,  and  the  coefficients, 
are  subject  to 


which 


m 

£c.  «  1  (6.9) 

j*0 

v 

Mesina’s  convergence  arguments  lead  to  the  same  sort  of  polynomials  that  Cabay 

and  Jackson  employed.  Further,  he  points  out  that  his  method  is  valid  when  a  few 

eigenvalues  of  the  iteration  matrix,  A,  have  absolute  values  greater  than  unity. 

He  has  applied  his  method  to  a  problem  in  neutron  transport  theory,  where  the 

dimension  of  the  system  of  equations  is  about  3000,  and  claims  a  reduction  of 

iterative  count  by  a  factor  ranging  from  3  to  5.  This  claim  is  quite  comparable  to 

the  claim  of  the  present  report. 

42 

Smith  and  Ford  combine  the  two  foregoing  methods  into  what  they  call  the  CJM 
method  and  apply  it  to  a  sequence  of  vectors  generated  by  a  nonlinear  iteration  in 
n  space 
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(6.10) 


u . 
1 


x . 
l 


(6.13) 


They  go  on  to  show  that  the  vector  outputs  of  a  sequence  of  iterative  cycles, 

in  which  the  output,  x;,  of  each  cycle  becomes  the  input,  xQ,  of  the  next  cycle, 

behave  exactly  like  the  sequence  of  vectors  produced  in  a  similar  fashion  with  the 

aid  of  Wynn’s  vector  epsilon  algorithm;  hence,  according  to  a  result  proved  inde- 

29  30 

pendently  by  Brezinski  and  by  Gekeler,  the  vectors  of  this  sequence  converge 
quadratically  in  the  norm  toward  the  solution  of  x  =  F(x). 

Their  paper  concludes  with  comparisons  of  the  performance  of  the  CJM  method 
with  the  performance  of  the  vector  epsilon  algorithm  on  a  set  of  small  test 
problems. 

According  to  Smith  (private  communication),  Skelbue’s  "algorithm  is  very 
similar  to  what  we  call  CJM,  and  he  proves  a  similar  quadratic  convergence  result, 
albeit  by  a  much  longer  proof." 

It  does  indeed  appear,  as  Brezinski  pointed  out,  that  the  last  few  algorithms 
described  above  are  essentially  the  same  as  the  reduced  rank  algorithm  presented 
herein.  Apparently  the  time  is  ripe  for  such  a  step  forward  in  this  area  of 
numerical  analysis. 
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To  Claude  Brezinski  I  am  indebted  for  expressions  of  interest  in  my  work  and 
for  encouragement,  and  also  for  copies  of  his  reports,  computer  programs  (including 
his  implementation  of  Wynn’s  vector  epsilon  algorithm),  and  books. 

T.M.E.  Greville,  H.K.  Cheng,  D.A.  Smith,  A.H.  van  Tuyl,  and  John  Todd  all  sent 
me  reports  or  reprints  of  their  work  in  the  theory  or  use  of  extrapolation 
techniques. 

Ian  Barrodale,  who  heard  my  oral  presentation  of  the  early  version  of  reduced 

23 

rank  extrapolation  (Eddy  (1979)),  pointed  out  to  me  the  usefulness  of  the  ^  norm 
and  sent  me  card  decks  for  his  and  optimization  programs. 

My  colleague,  Francis  Henderson,  supplied  me  with  a  card  deck  for  the  Businger- 
Golub  singular  value  decomposition  algorithm  CSVD,  which,  modified  to  handle  real 
arithmetic  only,  figured  importantly  in  one  phase  of  my  computational  experiments. 

Mary  Beth  Marquardt  and  Richard  van  Eseltine  helped  me  over  the  hurdles  of 
learning  to  use  a  computer  terminal  (in  place  of  both  runs  with  decks  of  cards)  to 
run  my  experimental  computations. 
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APPENDIX 

LISTING  OF  THE  COMPUTER  PROGRAMS 


SIMP:  Control  routine 

P3D:  Subroutine  for  iterative  solution  of  Laplace’s  equation 

XTRP:  Subroutine  for  reduced  rank  extrapolation 

These  programs  are  written  in  CDC  FORTRAN  Extended  which  is  essentially 
FORTRAN  IV. 
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LISTING  FOR  R  P  EDDY  CODE  1063 


S  APRIL  1900 


PROGRAM  SIMP t INPUT, OUTPUT, T APE5* I NPUT ,T APE6*OUTPU T ) 

NFS  *  •  TO  IGNORE  UNSUCCESSFUL  EXTRAPOLATION. 

MFB  *  1  TO  FEEDBACK  UNSUCCESSFUL  EXTRAPOLATION. 

MON  *  •  FCR  TOTAL  STEP  ITERATION.  INPUT  IS  XOLO  ONLY. 

HON  s  1  FCR  SINGLE  STEP  ITERATION.  INPUT  IS  XOLO/XNEM  AS  AVAILABLE. 

NITM  N  =  0,  THIS  OECK  IS  SET  UP  FOR  A  FREE  RUN— NO  EXTRAPOLATION. 
DIMENSION  XOLOf 1000) t  XNEN(IOQQ),  RSI1000)  FOR  PJO 

NR  *  10 


FOR  PJO 
FOR  PJO 
FOR  PJO 


NC  *  10 
NL  «  IB 
N  «  NR*NC*NL 
NPPL  *  NR*NC 
TOL  *  0.1E-T 
MXT  *  GOO 
ISK  =  0 
NSK  *  0 

N  *  0 
MFB  =  0 
MON  a  0 

10  NRITE (6,1010) 

1010  FOP HAT I 5 0M1 ACCELERATION  OF  CONVERGENCE  BY  REOUCEO  RANK  EXTR APOLATI 
ION) 

11  NRITE (6, 1011 ) 

1011  FORMAT (7SM0LAPLACE  EQUATION  IN  A  PARALLELOPIPEO  KITH  10  ROMS  10  COFOR  PJO 

1LUMNS  ANO  10  LAYERS)  FOR  PJO 

12  NRITE (6, 1012) 

1012  FORMAT  I32H  HOMOGENEOUS  BOUNOARY  CONDITIONS) 

13  NRITE (6, 1013) 

1013  FORMAT (33H  7-POIKT  PATTERN  EQUAL  HEIGHTS)  T  POINT 

1%  NRITE (6, 101 A) 

1016  FORMAT (BOH  COMPONENTS  OF  STARTING  VECTOR  ARE  PSEUOO-RANOOM  NUMBERS 
1  -0.01  .LE.  X  .LE.  »0.01) 

PARAMETER  DO  LOOP  CONTROLS  GO  HERE. 

15  NRITE (6, 1015)  ISK*NSK,N,HFB,HON 

1015  F0RMAT(11H0PARANETERS,5X,5HISK  a, I5,5X,5HNSK  *,I3,rX,3HN  a,I3,6X,5 
1HHFB  *,I2«6X«5MM0N  a, 12) 

GENERATE  RIGHT  SIOE  VECTOR,  RS 

20  00  21  I* 1,N 

21  RSII )  «  0.0 

GENERATE  STARTING  VECTOR 

PSEUOOR ANOON  NUMBERS  IN  THE  RANGE  -.01  .LE.  X  .LE.  *.01 

25  DO  26  1*1  ,M 

26  XOLO(I)  *  .02*RANF(I)  -  .01 
BEGIN  ORDINARY  ITERATION  CYCLE 
NIT  *  1 

NEX  *  ISK 

30  CALI  P30 (NR *NC,NL »NPPL» RS, XOLO*KN£M*NON)  FOR  PJO 

CALCULATE  LI,  L2,  ANO  LINF  NORMS  OF  DIFFERENCE  VECTOR 
R1  *  0.0 
R2  a  0.0 


R3  a  0.0 
35  00  37  I*1,M 

N  a  XNENI1)  -  XOLO(I) 
NA  a  ABS(N) 

R1  *  R1  ♦  NA 
R2  *  R2  ♦  N*N 


non 


I 


37  IFIWA  .GT.  R3I  R3  *  WA 
R2  *  SORT  (R2  I 

C  TEST  FOR  CONVERGENCE 

NO  IFIR2  -  TOD  N1,N1,90  12  NORN 

C  CONVERGENCE  ATTAINED 
N1  WRITE  16* ION  1 )  NIT 

10N1  FORMAT  1 2 OH  CONVERGENCE  ATTAINED*  NIT  *,I9> 

WRITE  OUTPUT  VECTOR*  ETC*  HERE. 

GO  TO  09 

90  IFCNIT  -  NEX)  70*70,60 

60  CALI  XTRPIM,N*XOLO,XNEW*TOL*NSK,NEX*NIT,MFBI 

70  00  71  Is 1 «M 

71  XOLOfll  *  XNEW(I) 

00  NIT  *  NIT  ♦  1 

IFINIT  -  MXT )  30,30,99 
99  WRITE (6*10991  NIT 
1099  FORMAT (6N0NIT  **I5,?9M  • GT.  MXT  WITHOUT  CONVERGENCE) 

99  WRITE (6, 1099) 

1099  FORMAT C9H0F IN ISMEO) 

STOP 

ENO 


j 

t 

I 
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LISTING  FCR  R  R  ECOY  C00€  1843 


•  APRIL  1988 


1 


SUBROUTINE  P3DINR»NC»NL*NPPL.RS,XOLD»XNEN,NONI 
7  POINT  PATTERN 

NON  »  0  FCR  TOTAL  STEP  ITERATION.  INPUT  IS  XOLO  ONLY. 

NON  «  1  FOR  SINGLE  STEP  ITERATION.  INPUT  IS  XOLO/XNEN  AS  AVAILABLE. 

DIMENSION  XOLOtlOOOt*  XNEMI18001  *  RSI1080I 
1*8 

00  32  KCsl.NL 
00  31  JCsl.NC 
00  38  ICsl.NR 
I  *  1*1 
M  *  RSIII 
IF(MON)  18,1$ 

18  IFIIC.NE.  1>  M*N*XNENII-ll 
IF! JC.NE.  It  N*N»XNEM<I>NRI 
IFIKC.NE.  1»  N*N*XNEMII-NPPLI 
GO  TO  20 

IS  IFtIC.NE.  II  N*N»XOLOtI>lt 
IFtJC.NE.  II  M»N*XOLOII-NRI 

IFIKC.NE.  II  M*M*XOLDII-NPPLI  j 

28  IFtrC. NE.NRI  N*M»XOLOII«lt  1 

IFIJC.NE.NO  N*M»VOLOfI»NRI  J 

IFIKC.NE. NLI  N»N*XOLOII*NPPLI  J 

30  XNENIII  =  M/6.  il 

31  CONTINUE  J 

32  CONTINUE  1 

RETURN  1 

ENO  ] 


oo  o  o  oooo  oooooo 
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SUBROUTINE  XTRPI M.N, XOLO ,XNEM,T0L ,NSK ,NEX ,NIT , MFB I 
REDUCED  RANK  EXTRAPOLATION 

THIS  VERSION  IS  F CR  USE  NITH  L2  ANO  SUCCESSIVE  TRIANGULAR  DECOMPOSITION 
ALSO  CALLED  EXPANCTNG  CHOLESKI. 

MFB  s  0  TO  IGNORE  UNSUCCESSFUL  EXTRAPOLATION. 

MFB  *  1  TO  FEEDBACK  UNSUCCESSFUL  EXTRAPOLATION. 

DIMENSION  XOLOIIOOOI,  XNEMI 10001,  XHIDI1800I 
DIMENSION  OIF 1(1000, 111 •  DIF2I10 00 , 10 » 

DIMENSION  XKSII10I,  RSIOIIOOOI,  HUTF(lO.ll) 

DATA  NCC.NCT  / 8,0 t 

BUILO  UP  FIRST  ANO  SECONO  ORDER  DIFFERENCE  MATRICES. 

CALCULATE  THE  NCC*1  COLUMN  OF  0IF1 

10  00  11  1=1, M 

11  DIF1 (I.NCCtll  =  XNEMIII  -  XOLOIII 
IFINCCI  19,19,20 
SAVE  INITIAL  VECTOR 

19  00  16  1=1, M 
16  XMLOm  =  XOLOIII 
NCC  =  NCC  ♦  1 
GO  TO  100 

CALCULATE  THE  NCC  COLUMN  OF  DIF2 

28  00  21  1=1, H 

21  DIF2(I,NCC>  *  OIFlf I,  ICC»1)  -  OIFIII.NCCI 

CALCULATE  THE  NCC  COLUMN  OF  H»M 
200  DO  206  1=1, NCC 
M  >  0.0 
DO  202  K=1»N 

202  N  *  M  ♦  DIF2IK,II*0IF2IK,NCC1 
206  HUTFI I, NCC)  *  M 
C  CALCULATE  NCC  ELEMENT  OF  H*U0 
M  >  0.0 
DO  206  K=1 * M 

206  M  *  M  ♦  OIF2 IK,NCCl*OIF 1 IK, 1 I 
HUTF I NCC ,N»1 I  =  M 

C  CALCULATE  NCC  COLUMN  OF  TRIANGULAR  SQUARE  ROOT 

IFINCC  -  1)  210,210,212 
210  HUTF 11,1)  <  SORT IHUTFIl, 111 

HUTF 1 1, Ntll  *  MUTFI1, N* 1 1 /HUTFI1, II 
GO  TO  230 

212  HUTFI1,NCCI  <  MUTFIl , NCC I /HUTFIl , II 
DO  229  I=2,NCC 
IM  «  I  -  1 
M  =  HUTFII.NCCI 
00  216  K= 1, IM 

216  M  »  M  -  HUTF IK, I I *HUTFIK*  NCCI 
IFII  -  NCC I  216,218,218 
216  HUTP 1 1, MCCI  >  M/HUT F 1 1, 1 1 
GO  TO  229 

218  IFIMI  220,220,226 
220  NRITEI6, 12201  NCY ,NCC,NIT,M 

1220  FORMAT I8MOTROU8LE ,7X,9HNCV  =,I3*  9X.9HNCC  *,I3 , 6X, 9HNIT  >,I6,6X,23H 
101 AGONAL  ELEMENT  *  SORT, El 3. 61 


FOR  »!0 
FOR  P3  0 
FOR  PSO 
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C  EXTRAPOLATE  KITH  LAST  USABLE  XKSI.  THAT  FOR  NCC-1 
NCH  a  NCC  -  1 

C  SEE  COHHENTS  FOLLOWING  LINE  72  BELOW. 

IF(MFB)  75,80 

22*  HUTF ( NCC  « NCC I  *  SORT (Ml 

225  CONTINUE 

C  CALCULATE  NCC  ELEMENT  OF  RIGHT  SIDE 
W  =  HUTF (KCC«N*1) 

DO  226  Kal,IH 

226  W  *  W  -  HUTF(K,NCCI*HUTF(K,N*1I 
HUTF ( NCC ,N»1 I  >  W /HUTF I NCC, NCC I 

CALCULATE  XKSI,  THE  VECTOR  OF  EXTRAPOLATION  COEFFICIENTS 
ROW  L  HUTF (L ,LI *XKSI(L I  ♦  SUM (HUTF  (L »L*KI#XKSI(L *KI  »  HUTF(L,N»1I  a 

1  ,LE.  K  .LE.  NCC-L  «  I 
230  XKSKNCCI  »  -HUTF (NCC, N* II /HUTF ( NCC, NCC) 

DO  23*  1=1, IN 
L  *  NCC  -  I 
W  =  -HUTF (L  «  N*ll 
00  232  K=  I, I 

232  W  *  W  -  HUTF (L,L*KI*XKSI(L*KI 
234  XKSI (LI  =  W/HUTF (L,L I 

CALCULATE  RSIO,  THE  RESIOUAL  VECTOR  ANO  ITS  NORMS  UNOER  L1*L2*  LIKF 
40  R1  *  0.0 
R2  =  0.0 
R3  =  0.0 

45  DO  48  1=1. H 
W  =  DIF 1(1,11 
00  46  J«1,NCC 

46  W  *  W  ♦  OIF2(Z,J)*XKSZIJ) 

WA  =  ABS  INI 
R1  *  R1  *  WA 
R2  =  R2  ♦  W*W 
IF (WA  .GT.  RSI  R3  =  WA 

48  RSIO (II  =  W 
R2  >  SORT (R2I 
TEST  FOR  CONVERGENCE 
NCH  =  NCC 

70  IFIR2  -  TOLI  75,75,71  L2 

ADVANCE  COLUMN  COUNTER,  NCC 

71  NCC  =  NCC  ♦  1 
IFINCC  -  N)  100,100,72 

72  IF(MFB)  75,80 

GO  TO  75  TO  FEEO  BACK  UNSUCCESSFUL  LAST  EXTRAPOLATION 
GO  TO  80  TO  IGNORE  UNSUCCESSFUL  LAST  EXTRAPOLATION 
CALCULATE  EXTRAPOLATED  VECTOR  XTLO  ANO  PUT  INTO  XNEW 

75  00  77  1=1, M 
W  a  XHLDdl 
00  76  J*1»NCH 

76  N  a  N  »  DIFlfI,JI*XKSI( Jl 

77  XNEW I II  a  M 

RESET  COLUMN  COUNTER*  NCC,  FOR  NEXT  EXTRAPOLATION  CYCLE 
80  NCC  a  0 

AOVANCE  TEST  COUNTER  FOR  NEXT  EXTRAPOLATION  CYCLE  (SIMP  501 
05  NCX  a  NEX  >  NSK  ♦  N 
100  RETURN 


TEST 
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